home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Tech Arsenal 1
/
Tech Arsenal (Arsenal Computer).ISO
/
tek-02
/
asorts.zip
/
ASORTS.MAN
< prev
next >
Wrap
Text File
|
1993-01-04
|
21KB
|
524 lines
ASORTS
General-purpose routines for sorting, searching and moving
arrays of arbitrary elements.
Copyright 1991, by J. W. Rider
ASORTS provides the Turbo Pascal programmer with type
definitions, functions and procedures to handle a variety of
array sorting and searching tasks. Several of these are
analogous to functions of the same name in the C programming
language:
qsort -- sort an array
bsearch -- do a binary search of a sorted array for an
element that satisfies some key
lfind -- do a sequential ("linear") search of an unsorted
(or sorted) array for an element that satisfies
some key
lsearch -- do a sequential search of an unsorted (or sorted)
array for an element that satisfies some key. If
the element is not found, then it is appended to
the end of the array.
(Non-C programmers: please note that "bsearch" and "lfind" -- not
"lsearch"! -- do the equivalent task for sorted and unsorted arrays,
respectively. It would make more sense to have "bfind" to search
through a sorted array, and make "bsearch" insert a missing element
into the array at the right location. However, these routines are
provided for compatibility in converting C programs.)
swab -- move one portion of memory to another, swapping
high and low order bytes in the process
There are some routines that are provided for utility above and
beyond what is available in standard C libraries:
bfind -- do a binary search of a sorted array for an
element that satisfies some key. (This differs
from "bsearch" in that is "bfind" reports the
"insertion index" if the key element is not
found.)
binsert -- do a binary search of a sorted array for an
element that satisfies some key. If the element
is not found, then the key element is inserted in
the proper location in the array to maintain the
sorted order.
fibsearch-- searches an ordered array using a "fibonacci" search
algorithm.
fill -- moves a single item into all of the elements of
an array
heapsort -- order an array by the "heapsort" algorithm.
longaddr -- returns the address of a variable expressed as a
longint value
naivesort-- a particularly inefficient sorting algorithm that the
author has been known to use when "qsort" was not
available. (The use of "naivesort" in applications is
NOT recommended!)
scan -- returns the index of the first element in an array
that meets a specific criterion.
selsort -- "selection" sorting, another sorting routine.
shellsort-- yet another sorting routine, different from the rest.
shuffle -- randomly permutes ("unsorts") the elements of an
array
subfill -- moves a single item into all of the elements of a
subarray of the base array
submove -- moves the elements of the source array into a
subarray of the destination array (or, the
elements of a subarray of the source to the
destination array)
swap -- Exchanges two array elements or variables.
(NOTE! The "Asorts.Swap" procedure replaces the
default "System.Swap" which simply exchanges the
high and low-order bytes of a two byte expression.)
vqsort -- a quicksort algorithm for "virtual" arrays. "VQSort"
does not presume any kind of structure inherent within
the "thing" being sorted. Instead, "VQSort" provides
the sorting logic to other routines that will perform
the actual comparisons and exchanges.
vselsort -- a "virtual" selection sort.
vshuffle -- a "virtual" shuffler.
xsubmove -- moves the elements of a subarray of the source
array into the elements of a subarray of the
destination.
While these routines are provided to be of assistance to the programmer,
the number of different searching and sorting algorithms does raise the
issue of how to select any one algorithm for the programmer to employ.
Unfortunately, that is very application dependent. For general purpose
array sorting, you would do well to compare the "qsort" and "shellsort"
routines on your actual data. For sorting of typed files, I'd suggest a
variant of "VSelSort".
CONCEPTS
The ARRAYs to be manipulated are passed as "untyped vars" to
these routines. (In the "Interface" section, these arrays are
called "base", "source" or "destination".) These routines will
treat the ARRAYs as if they were declared to be of type:
ArrayType = ARRAY [1..MaxArray] OF ElementType
WARNING! It is *very* important to avoid defining an array
so that the last byte is in a memory segment different from
the first byte. As long as you never declare an array larger
than 65519, or $10000-15, bytes, it should not be a problem.
Each ELEMENT of the ARRAY is presumed to be fixed size, and this
size must be passed to the routines. (In the "Interface"
section, if an ELEMENT needs to passed directly to a routine as
an argument, it is passed as an untyped var called "key".) Also,
the number of elements in the ARRAY must also be passed. For
instance, to fill an array of real numbers with 0:
var RealArray : array [1..10] of Real;
x : real;
x:=0;
fill(RealArray,10,sizeof(real),x);
A SUBARRAY is a "byte-slice" of an array. For instance, if
"ElementType" is an "array [1..8] of byte", then a "subelement"
would be any contiguous collection of bytes within the
element, like 3,4 and 5. The SUBARRAY would be the collection of
all of the subelements stored in an ARRAY. If "ElementType" is a
record of fields, then a "subelement" would be any contigous
group of fields.
For sorting and searching, a COMPARISON FUNCTION must be passed
to the routines. COMPARISON FUNCTIONs take two untyped vars,
return a longint value, and must be declared "far" at
compilation. (DIFFERENT! In C, only an integer-sized value is
returned.) For instance, to sort the array of real numbers
declared earlier:
function RealCompare(var a,b):longint; far;
begin
if real(a)>real(b) then realcompare:=1
else if real(a)<real(b) then realcompare:=-1
else realcompare:=0;
end;
qsort(realarray,10,sizeof(real),realcompare);
"Virtual" arrays are data structures whose elements can be accessed
indirectly by an index. For instance, information that is physically
stored in multiple arrays might be sorted by a key in just one of the
arrays. ASORTS provides a few routines for handling such "virtual"
arrays. "VQSort" and "VSelSort" will provide the sorting logic for
ordering the arrays. "VShuffle" will similarly "unorder" the array.
To sort an array of "DBRec" with respect to an array of integer
priorities, declare:
var Array1 : array [1..MaxDBRec] of DBRec;
Priority: array [1..MaxDBRec] of integer;
function ComparePriority(a,b:longint):longint; far;
begin ComparePriority:=longint(Priority[a])-Priority[b] end;
procedure SwapPriDBRec(a,b:longint); far;
begin asorts.swap(Priority[a],Priority[b],sizeof(integer));
asorts.swap(Array1[a],Array1[b],sizeof(DBRec)); end;
and sort the two arrays with:
vqsort(MaxDBRec,ComparePriority,SwapPriDBRec);
INTERFACE
function bfind(var key{:element}, base{:array};
length_base, sizeof_element:word;
f:comparefunc):longint;
Searches a sorted array for a "key" element. Return the index
of its location, or the negative of the index of the largest
element in the array that is smaller than the key (i.e., the
element that you want to insert the new element after).
function binsert(var key{:element}, base{:array};
length_base, sizeof_element:word;
f:comparefunc):longint;
WARNING: This routine overwrites memory if used incorrectly.
Inserts the key element into the correct position of an
ordered array. Unlike "lsearch", which only adds the key if
it's not already present, "binsert" ALWAYS inserts a new
element into the array. "Binsert" returns the index where the
element is inserted.
function bsearch(var key{:element}, base{:array};
length_base, sizeof_element:word;
f:comparefunc):word;
"Bsearch" attempts to locate the "key" element within the
previously SORTED array "base". If successful, the index
of the found element is returned; otherwise, 0 is returned
to indicate that the element is not present.
type comparefunc = function (var a,b{:element}):longint; {far;}
Declares the type of the comparison function to be passed to
sorting and searching routines. CompareFunc's are
user-defined functions that takes two arguments a and b. A
and B must be typeless in the declaration, but otherwise are
of the same type as the elements of the "base" array. For
"qsort" and "bsearch", the function needs to return a
negative integer if "A<B"; a positive integer if "A>B"; and 0
if "A=B". For "lfind" and "lsearch", the function needs to
return 0 if "A=B", and some non-zero integer if "A<>B".
function fibsearch(var key{:element}, base{:array};
length_base, sizeof_element:word;
f:comparefunc):word;
"Fibsearch" attempts to locate the "key" element within the
previously SORTED array "base". If successful, the index
of the found element is returned; otherwise, 0 is returned
to indicate that the element is not present. (This procedure
is included because a user asked about the algorithm on one
of Borland's CompuServe Forums. It looks like an interesting
alternative to "bsearch", but I have not run extensive
comparisons.)
procedure fill(var key{:element}, destination{:array};
count, sizeof_element:word);
WARNING: This routine overwrites memory if used incorrectly.
Moves the "key" element to the first "count" indices in the
"destination" array.
procedure heapsort(var base {:array};
length_base, sizeof_element:word;
f:comparefunc);
WARNING: This routine overwrites memory if used incorrectly.
"HeapSort" reorders the elements of the "base" array using a
heapsort algorithm (like, you expected something else?). The
function "f" is used to compare two elements of the array, and
must return a negative number if the first argument is "less
than" the second, a postive number if the first argument is
"greater than" the second, and zero if the two arguments are
"equal".
type icomparefunc = function (a,b:longint):longint;
Declares the type of the comparison function to be passed as
an argument for sorting "virtual" arrays. Instead of passing
the elements to be compared as is done for "comparefunc",
ICompareFunc's use the location of the elements.
function lfind(var key{:element}, base{:array};
length_base, sizeof_element:word;
f:comparefunc):word;
"Lfind" attempts to locate the "key" element within the array
"base". If successful, the index of the found element is
returned; otherwise, 0 is returnd to indicate the element is
not present.
function longaddr (var x): longint;
Returns the address of a variable expressed as a longint value
so that address can be used for comparisons, etc.
function lsearch(var key{:element}, base{:array};
length_base, sizeof_element:word;
f:comparefunc):word;
WARNING: This routine overwrites memory if used incorrectly.
Does a linear search of the "base" array, looking for the "key"
element. If the key element is found, "lsearch" returns the
index of the array. *** NOTE! *** Otherwise, the key element
is appended to the end of the array. It is the programmer's
responsibility to ensure that "sizeof_element" bytes are
available at the end of the array and that the count of
contained elements is adjusted. To avoid the append, use
"lfind" instead.
var monitor : procedure;
procedure nullmonitor;
"Monitor" and "NullMonitor" were debugging devices developed
in the process of putting together the ASORTS unit. They can
be optionally declared by defining a compilation variable
"MONITOR". Every time that the "Qsort" algorithm swaps a pair
of array elements, the "monitor" procedure is called. This
will allow the user to watch the progress of the sort. This
is of marginal practical value, and by default, these two
identifiers are not defined. Calling "NullMonitor" will turn
off monitoring even if the "monitor" procedure variable is
defined.
procedure naivesort(var base {:array};
length_base, sizeof_element:word;
f:comparefunc);
WARNING: This routine slowly overwrites memory if used incorrectly.
"NaiveSort" also reorders the elements or an array. However,
it does it more slowly than "QSort". The use of the procedure
is not recommended. It is provided for comparison only. (i.e.,
don't waste your time trying to improve upon it. It's only here
for comic relief.)
procedure qsort(var base {:array};
length_base, sizeof_element:word;
f:comparefunc);
WARNING: This routine overwrites memory if used incorrectly.
"Qsort" reorders the elements of the "base" array using a
quicksort algorithm. The function "f" is used to compare
two elements of the array, and must return a negative
number if the first argument is "less than" the second, a
postive number if the first argument is "greater than" the
second, and zero if the two arguments are "equal".
function scan(var source; count, sizeof_element:word;
f:testfunc):word;
"Scan" does a linear search of the "source" array and
returns the index of the first element that satisfies the
test function "f". If no element is found, "scan" returns 0.
procedure selsort(var base{:array};
length_base, sizeof_element:word;
f:comparefunc);
WARNING: This routine overwrites memory if used incorrectly.
"SelSort" reorders the elements of the "base" array using a
selection sort algorithm. This algorithm minimizes the number
of times that each element is moved, but will maximize the
of comparisons. For most applications, the comparisons take
more time than the exchanges, so you are not likely to want to
use this algorithm for ordinary array sorting. See also,
"VSelSort".
procedure shellsort(var base{:array};
length_base, sizeof_element:word;
f:comparefunc);
WARNING: This routine overwrites memory if used incorrectly.
"ShellSort" reorders the elements of the "base" array using a
sort routine first defined by an individual whose last name was
Shell. In concept, the algorithm is an insertion-type sort (as
opposed to exchange-type sort (of which "qsort" and "selsort"
are examples). This turns out to be a very efficient sorting
routine for in-memory sorting.
procedure shuffle(var base{:array};
length_base, sizeof_element:word);
WARNING: This routine overwrites memory if used incorrectly.
Randomly permutes ("unsorts") the elements of an array. The
SYSTEM "Randomize" procedure should be called at least once
in a program that shuffles an array.
procedure subfill(var key{:subelement}, destination{:subarray};
count, sizeof_key, sizeof_element:word);
WARNING: This routine overwrites memory if used incorrectly.
Partially fills an array with the "key". The address of
"Destination" should be the address of the first byte in the
subarray. The portion of the array outside of the subarray is
left unchanged.
procedure submove(var source{:[sub]array},
destination{:[sub]array};
count,
sizeof_source_element,
sizeof_destination_element:word);
WARNING: This routine overwrites memory if used incorrectly.
If the size of the source elements are smaller than that of
the destination elements, moves the first "count" elements of
the source into a subarray of the same size in destination.
If larger, only moves that portion of the source that will
fill the first "count" elements of the destination. If equal
in size, does a simple "move" of the bytes.
procedure swab(var source, destination; numwords:word);
WARNING: This routine overwrites memory if used incorrectly.
Moves the contents of source to the destination, swapping
high and low order bytes in the process. (Note: while this
is provided for C compatibility, the third argument is used
differently. In C, the third argument is the number of bytes
to move and is expected to be even. Here, the third argument
is the number of byte-pairs to move.)
procedure swap(var var1,var2; sizeof_element:word);
WARNING: This routine overwrites memory if used incorrectly.
"Swap" exchanges the contents of the variable "var1" with
the contents of the variable "var2". (Note: this is a
redefinition of the "swap" function defined in the System
unit.)
type swapproc = procedure (a,b:longint); {far;}
Declares the type of the exchange procedure that is passed as an
argument to "selsort" and other virtual exchange sort algorithms.
type testfunc = function (var a):boolean; {far;}
Declares the type of the criterion function to be passed to
the "scan" function. The actual TestFunc should expect an
array element to be passed through "a" and return true if the
element satisfies the criteria. Otherwise, it should return
false.
procedure vqsort(length_base:longint; f:icomparefunc; s:swapproc);
"VQSort" provides the logic to do a quicksort of an indexed
entity (a "virtual" array), but depends upon the user-defined
routines "f" and "s" to do the actual work of accessing specific
elements in the array, comparing and exchanging them. This
makes VQSort useful for sorting elements when they are stored
in something other than a single contiguous array.
procedure vselsort(length_base:longint; f:icomparefunc; s:swapproc);
"VSelSort" provides the logic to do a selection sort of an
indexed entity (a "virtual" array), but depends upon the
user-defined routines "f" and "s" to do the actual work of
accessing specific elements in the array, comparing and
exchanging them. As mentioned in the description for "SelSort",
the algorithm minimizes the number of exchanges of elements
required to put something into sorted order, but makes a large
number of comparisons to do so. This would make "VSelSort"
useful for something like sorting an external file of larger
records based upon integer keys stored in an array in memory.
procedure vshuffle(length_base:longint; s:swapproc);
Of course, what fun is there in being able to order virtual
arrays if we can't mix all the elements up again?
procedure xsubmove(var source{:subarray}, destination{:subarray};
count, sizeof_source_element,
sizeof_destination_element,
sizeof_subelement:word);
WARNING: This routine overwrites memory if used incorrectly.
Moves a subarray from the source to the destination. The size
of the subelements is presumed to be the same in both
subarrays. "XsubMove" does not check to make sure that the
sizes are consistent.
REFERENCES
Knuth, The Art of Computer Programming, Sorting and Searching.
Press, et al., Numerical Recipes.
Sedgewick, Algorithms.